-
Notifications
You must be signed in to change notification settings - Fork 15.1k
clang/AMDGPU: Emit grid size builtins with range metadata #113038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clang/AMDGPU: Emit grid size builtins with range metadata #113038
Conversation
|
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) ChangesThese cannot be 0. Full diff: https://github.com/llvm/llvm-project/pull/113038.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 28f28c70b5ae52..69a7dfc2433ae8 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18538,6 +18538,12 @@ Value *EmitAMDGPUGridSize(CodeGenFunction &CGF, unsigned Index) {
auto *GEP = CGF.Builder.CreateGEP(CGF.Int8Ty, DP, Offset);
auto *LD = CGF.Builder.CreateLoad(
Address(GEP, CGF.Int32Ty, CharUnits::fromQuantity(4)));
+
+ llvm::MDBuilder MDB(CGF.getLLVMContext());
+
+ // Known non-zero.
+ LD->setMetadata(llvm::LLVMContext::MD_range,
+ MDB.createRange(APInt(32, 1), APInt::getZero(32)));
LD->setMetadata(llvm::LLVMContext::MD_invariant_load,
llvm::MDNode::get(CGF.getLLVMContext(), std::nullopt));
return LD;
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
index bf5f2971cf118c..be6cee5e9217bf 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
@@ -639,7 +639,7 @@ void test_get_workgroup_size(int d, global int *out)
// CHECK-LABEL: @test_get_grid_size(
// CHECK: {{.*}}call align 4 dereferenceable(64){{.*}} ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
// CHECK: getelementptr inbounds i8, ptr addrspace(4) %{{.*}}, i64 %.sink
-// CHECK: load i32, ptr addrspace(4) %{{.*}}, align 4, !invariant.load
+// CHECK: load i32, ptr addrspace(4) %{{.*}}, align 4, !range [[$GRID_RANGE:![0-9]+]], !invariant.load
void test_get_grid_size(int d, global int *out)
{
switch (d) {
@@ -896,5 +896,6 @@ void test_set_fpenv(unsigned long env) {
__builtin_amdgcn_set_fpenv(env);
}
+// CHECK-DAG: [[$GRID_RANGE]] = !{i32 1, i32 0}
// CHECK-DAG: [[$WS_RANGE]] = !{i16 1, i16 1025}
// CHECK-DAG: attributes #[[$NOUNWIND_READONLY]] = { convergent mustprogress nocallback nofree nounwind willreturn memory(none) }
|
| __builtin_amdgcn_set_fpenv(env); | ||
| } | ||
|
|
||
| // CHECK-DAG: [[$GRID_RANGE]] = !{i32 1, i32 0} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the upper bound is smaller than the lower bound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is how you are supposed to represent the wrapped set where the 0 value isn't allowed but the uint32_max is. See ConstantRange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I c. range does allow wrap.
297173f to
5336b21
Compare
708215d to
79c3169
Compare
5336b21 to
0bd73d5
Compare
79c3169 to
6e74018
Compare
6e74018 to
6981d5a
Compare
This was specifically a reorder. I've seen this in some reorder cases, but not others
I am not controlling the process in any way |
These cannot be 0.
6981d5a to
2e3964f
Compare


These cannot be 0.